Text File
|
1997-06-14
|
43KB
|
1,267 lines
1.1) General introduction
1.2) Liability and authorship
1.3) General setup issues
1.4) Command line switches
1.5) Credits
2.1) ANSI compatibility
2.2) Missing features
2.3) C++ style features
2.4) Known bugs
3.1) Standard run-time libraries.
3.2) DOS libraries
4.1) Implementation-dependent keywords
4.2) Implementation-dependent preprocessor functions
4.3) Phi-text compatibility
5.1) Errors
6.1) Stack frames
6.2) ASM interface
6.3) Segmentation
6.4) Optimizations
7.1) Description of directory tree
7.2) Porting
1.1) General introduction
CC386 is a generic 386 DOS C compiler. Every effort has been made to
have it recognize standard ANSI syntax; however, it should NOT be
expected to produce code conforming to the ANSI standards, especially
in regard to floating point. CC386 outputs assembly language code
suitable for TASM and NASM, and possibly it will work with MASM.
This package includes various support programs and libraries required
to build code that will run under an MSDOS DPMI server. TRAN's PMODE
is used for the server, so if there is no memory management software,
programs generated by this package will still work.
You need TASM and TLINK to use this package. You can get by with
WLINK; however TASM is still a requirement. The compiler itself will
generate NASM compatible code, however this package is not sufficient
to have that code actually run under DOS.
This package consists of the compiler, some Borland DPMI stubs
currently needed for the compiler, run-time libraries for DOS, and
header files. Separate packages have the source for the compiler and
the run-time libraries.
Additionally, the package includes a program 'CL386' which will
call the compiler, TASM, and TLINK to create programs for you.
Warning: you must have a version of TASM earlier than version 4.1;
TASM 4.1 has some bugs. The code was tested with 4.0; it will probably
work with 3.0 and maybe even 2.0.
Part of the run-time libraries is a DEBUG-style debugger that can be
linked into the code. Note that this debugger will NOT work inside
a Windows 95 DOS box; however the rest of the package will.
New features and bug fixes:
revamp of floating point to make it at least work a little. Now that I
have a coprocessor :).
exception handling that works, including a floating point exception
handler that will catch it if you use floating point when there is
no coprocessor.
revamp of all the implicit cast operations to make them work properly
fix to code generation for pointers to functions
fix to the bit-size variables used in structures
inline assembler recognizes ALL 486 opcodes (32-bit addressing modes only)
various fixes to the debugger
addition of SPAWN functions to the run-time library
This program IS capable of compiling itself and having the image run;
I do not distribute this latter image because it will not run on a 386
when there is no FP coprocessor.
1.2) Liability and authorship
This compiler is presented on an 'as-is' basis without any guarantee of
usability or fitness for any given application. Risks associated with
using it, including financial loss or loss of life are not the
responsibility of the authors. However, this compiler is intended as
an educational tool, and is not to be used commercially in any case
without the express written consent of the authors.
The original author is Matthew Brandt. As he left it, it was a K&R
style compiler with no floating point and minimal preprocessor support,
targeted only for the m68k. Much of the work was done on a Unix
machine and later ported to DOS. You can find his version on one of
the Motorola file sites if you wish to compare. The current version
has been updated extensively to support a variety of ANSI constructs as
well as i386 support and a better preprocessor. However, parts of
the program still reflect Matthew's work.
I have done my part of the coding with 16 and 32 bit MSDOS compilers.
This version of the code has NO dos-specific features in it and should
be portable to any 16 or 32 bit ANSI compiler.
1.3) General setup issues
To install the version which creates DOS executables, run the install.bat
file. You need to give it a drive name:
install C:
will install the necessary files on drive C:. A directory tree will
be made under \CC386 and all necessary files will be copied there.
Read the file INTRO.DOS for a brief overview of how to create programs
for DOS.
The only thing really needed for the compiler to work is a pointer
to the include directories. These may be specified in the environment
variable 'CCINCL' or with the /I command line switch. I normally set
CCINCL=\cc386\include to get at the ANSI headers and then use /I if any
other directories are required. If you use the install.bat program
CL386 will take care of the include and library issues via its configuration
file and there is no other setup required other than to run the install
program.
1.4) Command line switches
Switches prefixed with a '+' or '-' may be turned on or off. The
last occurrence of the switch determines the state. For these switches
'/' is equivalent to '-'. Note that codegen parameters must generally
be the same for all modules in a program, or unpredictable results will
occur.
+e       - make error file
+i       - make preprocessed file
             default is -i
/ffile   - process arguments in file 'file'
+l       - make LST file
             default is -l
/w-all   - no warnings, errors only
             warnings may also be suppressed individually. See ERROR.DOC
-A       - disable ANSI compatibility and enable some non-standard features
             default is +A
/C       - codegen params
    /C+d - display internal diagnostics
    /C-b - no BSS
    /C-l - don't put C source in the ASM file
    /C-m - don't mangle symbols with a leading underscore
    /C+p - pack variables for space. On a 68020+ minimize at word
           alignment
    /C+r - reverse order of bit ops
    /C+F - (386) force the TASM .MODEL directive to use FLAT mode
           This may become the default in some future release.
    /C+N - generate NASM code
    /C-R - use stack pointer rather than link registers
             default is /C+blmR-prFN
/Dxxx    - define a macro 'xxx'
/E##     - max number of errors to generate
/Idirs   - specify include directories. Use a semicolon to separate multiple
           directory specifications. The directories specified by the
           environment variable "CCINCL" are always searched first.
/O       - Optimizer params
    /O-Rxxx  Turn off register optimizations. In place of the xxx
             put any combination of:
               a - turn off address register optimizations
               f - turn off floating point register optimizations
               d - turn off data register optimizations
             default is all register optimizations enabled
+S       - reserved, no use (yet)
             default is -S
Compiler will look for the symbol CC386 (or CC68K) in the environment.
If it finds it, it will evaluate any command line arguments in it
prior to evaluating the command line. Note that command line
parameters will override the environment variable; in particular
specifying a search path both in the environment var and on the command
line will result in loss of the search-path environment. There is an
alternate environment variable CCINCL which specifies include paths
which will be appended to the command line specification.
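The on/off switch rules above (last occurrence wins, '/' acts like '-',
environment arguments are evaluated before the command line) can be
sketched as follows. This is NOT the compiler's own source; the names
switch_state, apply_switch and apply_all are hypothetical, and only
two-character on/off switches such as +e or -l are modelled.

```c
#include <string.h>

/* Hypothetical state table: one on/off flag per switch letter. */
static int switch_state[256];

/* Apply one argument such as "+e", "-l" or "/i".
   '/' is treated like '-', as the text describes for on/off switches.
   Returns 1 if the argument had the on/off form, 0 otherwise. */
static int apply_switch(const char *arg)
{
    if (arg == NULL || strlen(arg) != 2)
        return 0;
    if (arg[0] == '+')
        switch_state[(unsigned char)arg[1]] = 1;
    else if (arg[0] == '-' || arg[0] == '/')
        switch_state[(unsigned char)arg[1]] = 0;
    else
        return 0;
    return 1;
}

/* Evaluate the environment arguments first and the command line second,
   so a later command-line switch overrides an earlier environment one. */
static void apply_all(const char **env_args, int env_count,
                      const char **cmd_args, int cmd_count)
{
    int i;

    for (i = 0; i < env_count; i++)
        apply_switch(env_args[i]);
    for (i = 0; i < cmd_count; i++)
        apply_switch(cmd_args[i]);
}
```

With "+l +e" coming from the environment and "-l" on the command line,
this model ends with the listing switch off and the error-file switch on.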
1.5) Credits
The following people contributed source code to this program.
Matthew Brandt: original K&R C compiler
Thomas Pytel (TRAN): DPMI extender for DOS
Kirill Joss: CL386 compiler shell
David Lindauer: Ansification, preprocessor, run-time libraries,
386 code gen, miscellaneous enhancements to original compiler
Many people were instrumental in locating bugs, I'd like to acknowledge
two who were especially helpful with *lots* of testing:
Johann Klockars
Kirill Joss
And thanks to David Gurevich and Kirill Joss for helpful suggestions on
packaging.
2.1) ANSI compatibility
This compiler is meant to be ANSI compatible at the source level. However
I have never seen the ANSI documentation for what that means; if
you find something it doesn't do, let me know!
However, there is no guarantee that code generated will meet ANSI
runtime requirements in terms of evaluation ordering, especially with
casts. Floating point is done using the host coprocessor, and is NOT
adjusted for ANSI/IEEE compatibility.
The run-time library is designed to act like an ANSI library; however
the internals are most likely somewhat different. Especially since I
did away with static buffers where possible.
2.2) Missing features
The following are known to be missing:
a) libraries don't handle any kind of floating point
b) expressions of the form:
(T)
are not handled correctly when T is a typedef
2.3) C++ style features
The C compiler has some rudimentary C++ support. It recognizes:
1) Overloaded functions (but not the overload keyword)
2) Variable declarations anywhere
3) Reference variables
4) Function parameter defaults
5) Stricter type checking
6) Improved init of static pointers and reference variables
7) Detailed C++ error messages
classes and C++ keywords aren't yet supported.
To enable these features use the extension .CPP on your input file
2.4) Known bugs
The following known bugs exist:
a) Expression evaluation is recursive. With a 4K compiler stack
the limit is approximately something like:
a = (b()+(c()+(d()+(e()+f()))));
Beyond this unpredictable results will occur. Raise the stack limit
or rearrange the expression with higher order parentheses to the left.
Notice this would not be a problem without the grouping parentheses
because the compiler wouldn't have to maintain so many contexts. I
compile the compiler with a 20K stack.
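A small sketch of the rearrangement suggested in a). The functions b()
through f() here are hypothetical stand-ins that just return constants.
All three forms compute the same integer sum; only the nesting depth the
parser has to carry differs. The order in which the calls are made may
differ between the forms, so this only applies when that order does not
matter.

```c
/* Hypothetical stand-ins for the functions named in the text. */
static int b(void) { return 1; }
static int c(void) { return 2; }
static int d(void) { return 3; }
static int e(void) { return 4; }
static int f(void) { return 5; }

/* Right-nested form from the text: every '(' opens a new context
   before the previous one can be finished. */
static int sum_right_nested(void)
{
    return (b() + (c() + (d() + (e() + f()))));
}

/* Same sum with the grouping moved to the left. */
static int sum_left_nested(void)
{
    return ((((b() + c()) + d()) + e()) + f());
}

/* Same sum split over several statements, which keeps every single
   expression shallow. */
static int sum_in_steps(void)
{
    int t = e() + f();

    t = d() + t;
    t = c() + t;
    return b() + t;
}
```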
b) Floating point may or may not work. A floating point library will
be added later and this will be checked out
d) expressions such as:
a = b = c;
may not return the correct value to anything other than the rightmost
assignment. In general it will work, but if there are multiple
implicit casts going on from one assignment to the next it may not work
correctly.
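One way to stay clear of d) is to split the chain so that each statement
performs a single assignment with a single conversion. A minimal sketch
follows; the function names are hypothetical. In standard C both
functions leave the same values behind, because the value of "b = c" is
the value stored in b after conversion.

```c
/* Chained form: relies on each assignment passing its converted
   value on to the next assignment to the left. */
static void assign_chained(long *a, unsigned char *b, int c)
{
    *a = *b = c;
}

/* Split form: one assignment and one implicit conversion per statement. */
static void assign_split(long *a, unsigned char *b, int c)
{
    *b = (unsigned char)c;  /* int -> unsigned char */
    *a = *b;                /* unsigned char -> long */
}
```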
e) % may not work properly for signed divisions. The sign may be
wrong but the value will be correct. This may or may not be a problem,
I haven't analyzed it.
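If the sign of a signed % result matters, one possible workaround is to
take the remainder of the magnitudes and apply the sign by hand. This is
only a sketch: the helper name is hypothetical, it assumes unsigned %
is not affected by the bug, and it gives the result the sign of the
dividend (the rule later standardized in C99; ANSI C89 leaves the sign
implementation-defined when an operand is negative).

```c
/* Remainder of a / b with the sign of the dividend a.
   Works on the magnitudes only, so no signed % is ever executed.
   The caller must ensure b != 0. */
static long rem_sign_of_dividend(long a, long b)
{
    /* 0UL - x is well defined for unsigned values, so even the most
       negative long is converted to its magnitude without overflow. */
    unsigned long ua = (a < 0) ? 0UL - (unsigned long)a : (unsigned long)a;
    unsigned long ub = (b < 0) ? 0UL - (unsigned long)b : (unsigned long)b;
    unsigned long r = ua % ub;

    return (a < 0) ? -(long)r : (long)r;
}
```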
f) long and unsigned constants will not be optimized or evaluated
correctly when there are two or more of them in an expression (type may
not propagate)
g) The identifier 'pascal' is a standard keyword rather than something
a user may redefine.
3.1) Standard run-time libraries.
The libraries were implemented according to 'The Waite Group's Essential
Guide to ANSI C', ISBN 0-672-22673-1. My copy is circa 1989.
Floating point library functions are not supported at this time, as I
have no way to test them. This includes things like atof and difftime
as well as most of the math libraries.
All the functions in this book were implemented except floating point.
However, process control stuff is kind of sketchy at this time. Most
of the IO library, part of the time library, and the malloc library
functions require operating system support. Documentation for this
is provided with the run-time library sources; however this
package contains sufficient code to use DPMI as the operating system.
There may be a variety of cases where things don't work as expected.
For example scanf will only read one line no matter what... when a
function such as strftime requires a buffer length to be given the
results are undefined if the text length exceeds the buffer length.
Also I just found out that opening a file with the 'a' attribute is
supposed to override any attempt to set the position for write in the
file... in this implementation all it does is position to the end of
the file at open time.
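Given the strftime caveat above, a cautious caller can check the buffer
size itself rather than count on the library to notice an overflow. The
sketch below uses only standard <time.h>; the wrapper name format_date
is hypothetical, and the 11-byte minimum assumes a four-digit year.

```c
#include <stddef.h>
#include <time.h>

/* Format a broken-down time as "YYYY-MM-DD" into out.
   Rejects buffers that cannot hold 10 characters plus the terminating
   NUL, so strftime is never asked to write more than fits.
   Returns the number of characters written, or 0 on refusal. */
static size_t format_date(char *out, size_t out_size, const struct tm *t)
{
    if (out == NULL || t == NULL || out_size < 11)
        return 0;
    return strftime(out, out_size, "%Y-%m-%d", t);
}
```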
The libraries were originally designed in a reentrant fashion; however
this breaks much standard code and the version of the libraries
included here has static buffers where called for.
ERRNO isn't supported at this time.
Many of the library functions depend on having the startup/rundown code
included. This code initializes a few global variables and executes
any startup/rundown functions the libraries need for initialization
and cleanup.
To use the libraries two files must be included in your link. First is
the startup module (c0dos or c0dosd) and second is the library itself (cldos).
An example build if _main is defined in q.c:
cc386 q.c
tasm /ml /m2 q.asm
tlink c0dos q,q,q,cldos
will build q.exe. Note that the startup module MUST be the first object
module specified as it defines the segmentation setup required.
Two startup modules are provided; c0dos is a standard C startup module.
c0dosd is the same module but it will draw in a debugger from the library
(approximately 16K) and call it rather than execute your code. The debugger
is somewhat similar to DEBUG. When the debugger starts up the EIP and
registers will be set to the values they would have at the beginning
of your _main function. The debugger traps several exceptions including
int 3 so you can put int 3 in your code at places you want to debug.
Warning! The debugger will NOT work in a DOS box in windows 95, as I
could not get appropriate access to exceptions.
The startup modules use TRAN's PMODE to manage pmode resources. I
use version 3.07... I had to modify the class names in his segment
declarations to make them different from the 32-bit code segments but other
than that they are his release. I have included the sources as per his
licensing in msdos\pmode307.
I manage several exceptions; traps 6, 13, and 14 are all routed through
the signal-handling code; by default they print a general protection
fault message and exit, but you can trap them using the signal mechanism
if you want.
Unless you are in a DOS box... likewise traps 7, 8, and 16 relating
to floating point are routed through the signal handling code unless
you are in a DOS box. The default signal handling code just prints a
message and jumps to the program exit point...
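Trapping these exceptions yourself would go through the standard
signal() interface. The sketch below uses only ANSI <signal.h>. Which
signal number each trap is delivered as is NOT stated here; the mapping
in the comments (SIGILL for trap 6, SIGSEGV for traps 13 and 14, SIGFPE
for the floating point traps) is an assumption, and the function names
are hypothetical.

```c
#include <signal.h>

/* Set by the handler so the main program can notice the fault. */
static volatile sig_atomic_t fault_seen = 0;

/* Handler: only records that something happened. Returning from a
   handler after a real hardware fault is undefined in standard C, so a
   real program would normally clean up and exit from here instead. */
static void on_fault(int sig)
{
    (void)sig;
    fault_seen = 1;
}

/* Install the handler for the assumed signal numbers.
   Returns 0 on success, -1 if signal() refused one of them. */
static int install_fault_handlers(void)
{
    if (signal(SIGILL, on_fault) == SIG_ERR)   /* assumed: trap 6 */
        return -1;
    if (signal(SIGSEGV, on_fault) == SIG_ERR)  /* assumed: traps 13, 14 */
        return -1;
    if (signal(SIGFPE, on_fault) == SIG_ERR)   /* assumed: FP traps */
        return -1;
    return 0;
}
```

Calling install_fault_handlers() early in main and then raise(SIGFPE)
is a simple way to check that the handler is reached.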
I also manage ctrl-c interrupt from DOS (but not ctrl-brk from BIOS, that
is handled via the DOS interrupt) and exit the program cleanly if ctrl-c
is pressed. Note that ctrl-c will even exit the debugger! I should
probably fix that...
4.1) Implementation-dependent keywords
a) The following implementation-dependent keywords have been added
                i386 use
_interrupt      Generate a function which may be used
                as a trap/interrupt.
_genbyte        Generate data in the code segment
_absolute       Allocate a global variable at
                an absolute address. Such variables
                will be directly addressable.
pascal          force the function declaration to use
                pascal calling conventions.
b) The following implementation variables have been added. These
variables directly access the assembly language registers they name.
Note they should be used with caution and may change periodically at
the compiler's discretion. Also, casts of them or assignments to
them may change the machine state functionally... and wreck the
code the compiler has generated.
i386
_EAX _EBX _ECX _EDX _ESP _EBP _ESI _EDI
The 386 compiler also knows the keyword 'asm' which is an escape to
allow inline assembly. The syntax is:
asm my_instruction;
or
asm {
my list of instructions;
}
The compiler catches most errors in inline assembly code at compile
time. It will also translate the names of local variables into
proper stack-based addressing modes.
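As a small illustration of the single-instruction form, the sketch below
drops an int 3 into the code at a point of interest, which the debugger
in c0dosd is described as trapping. The asm line is guarded with the
predefined _i386_ macro so the file still compiles elsewhere. The
function name is hypothetical, and whether the inline assembler accepts
exactly this spelling of the instruction is an assumption.

```c
/* Divide num by den. On a zero divisor, stop in the linked-in debugger
   (CC386 builds only) and then return 0 instead of dividing. */
static int checked_divide(int num, int den)
{
    if (den == 0) {
#ifdef _i386_
        asm int 3;      /* single-instruction form: asm my_instruction; */
#endif
        return 0;
    }
    return num / den;
}
```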
4.2) Implementation-dependent preprocessor functions
The preprocessor is more or less ANSI compatible.
The following #pragma statements are supported:
#pragma regopt xxx
now obsolete
#pragma startup xxx #
xxx may be any function name
# may be a priority value from 20 - 90 (other values are used
by the run-time library) Higher priority functions get run
first.
This option tells the compiler to inform the startup routines
that this function should be run prior to calling main.
#pragma rundown xxx #
xxx may be any function name
# may be an integer value from 20-90 (other values are used by
the run-time library). Higher priority functions get run
first.
This option tells the compiler to inform the startup routines
that this function should be run after main exits.
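A minimal sketch of the two pragmas in use. The function names and the
priority value 50 are arbitrary choices inside the documented 20-90
range. The functions are defined before the pragmas that name them,
since an unknown or undefined name is reported as an error. The pragmas
are guarded so that other compilers, which do not know them, skip them.

```c
/* Hypothetical module state set up before main and torn down after it. */
static int tables_ready = 0;

/* Candidate startup function: prepares the module. */
void init_tables(void)
{
    tables_ready = 1;
}

/* Candidate rundown function: releases the module. */
void release_tables(void)
{
    tables_ready = 0;
}

#if defined(_i386_) || defined(_m68k_)
#pragma startup init_tables 50
#pragma rundown release_tables 50
#endif
```

When built with another compiler the two functions have to be called by
hand, for example at the top and bottom of main.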
The following macros are predefined:
_i386_ (386 only) compiler is generating 386 code
_m68k_ (68k only) compiler is generating 68K code
__cplusplus if the compiler is allowing C++ extensions
__FILE__ the file name of the source file as a string
__DATE__ The date as a string
__TIME__ The time as a string
__LINE__ the line number as a number
#if macros can use defined(xxx) to determine if a macro is defined.
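For example, the predefined macros can be tested with defined() to pick
target-specific code at compile time. The function names below are
hypothetical; only _i386_, _m68k_ and __cplusplus come from the list
above.

```c
/* Report which code generator the file was built for. */
static const char *target_name(void)
{
#if defined(_i386_)
    return "i386";
#else
#if defined(_m68k_)
    return "m68k";
#else
    return "unknown";
#endif
#endif
}

/* Report whether the C++ extensions were enabled for this file. */
static int cpp_extensions_enabled(void)
{
#if defined(__cplusplus)
    return 1;
#else
    return 0;
#endif
}
```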
4.3) Phi-text compatibility
This compiler is capable of understanding 'phi-text' which is an
extended text-based character set. It is somewhat preferable to
UNICODE for western programmers as it does not encompass thousands of
characters that are little used by main-stream westerners.
Phi text is a banked character set. Each character in its full
form is 32-bits; this encompasses the following information:
cwb: a number from 32 to 127 describing the character
bank: a bank from 0 to 15. BANK 0 is the ASCII character set with
some modifications to control characters.
basic Attributes: BOLD, UNDERLINE, ITALIC, HIDDEN, and REVERSED
attributes. BLINKING may be substituted for ITALIC however we
normally use ITALIC.
color: 16 color renditions. The colors have been chosen to reflect
complementary colors. Foreground and background may be
specified for each character.
size: 16 size attributes
font: 16 font attributes
32 bits per character is a bit much for some applications; an
application may elect to ignore certain fields. This compiler ignores
ALL fields but the bank and the CWB (although it may look briefly
at attributes, I don't remember). Internally, the characters are thus
represented with 16-bit fields in this compiler.
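Since only the bank (0 to 15) and the cwb (32 to 127) are kept, each
character can be identified by a single small number. The sketch below
shows the arithmetic: 96 codes per bank, 16 banks, 1536 characters in
all. It is an illustration only; the names are hypothetical and the way
the compiler really packs the two fields into its 16 bits is not
described here.

```c
#define PHI_BANKS     16
#define PHI_CWB_MIN   32
#define PHI_CWB_MAX   127
#define PHI_PER_BANK  (PHI_CWB_MAX - PHI_CWB_MIN + 1)   /* 96 */
#define PHI_TOTAL     (PHI_BANKS * PHI_PER_BANK)        /* 1536 */

/* Map a (bank, cwb) pair to a flat index in 0..PHI_TOTAL-1.
   Returns -1 if either field is outside its documented range. */
static int phi_index(int bank, int cwb)
{
    if (bank < 0 || bank >= PHI_BANKS)
        return -1;
    if (cwb < PHI_CWB_MIN || cwb > PHI_CWB_MAX)
        return -1;
    return bank * PHI_PER_BANK + (cwb - PHI_CWB_MIN);
}
```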
To ease the storage requirements of such a character set, there exists
a 'streamed' form of phi-text. This takes advantage of the notion that
attributes are not likely to change as rapidly as the character
information. Basically, if the high bit of a streamed byte is set
it indicates that control information is embedded which indicates the
new attributes. There is also a 'repeat' code so that long strings
of repeating characters (for example spaces) get packed together. This
is not quite as efficient as tabbing but in the long run it works out
better because there are several situations where such strings may
occur in phi-text and they do NOT always involve spaces. In this
compiler, the incoming text is in streamed format (which defaults to
ASCII unless an appropriate editor is used). The streamed format is
converted to a flat format and all information is stripped except that
essential to detecting the character. Preprocessing is done on the
flat version... but when the scanner starts looking for tokens it then
converts the flat version back to stream (minus colors and attributes)
for more efficiency in the parser and back end. If the source file
is streamed phi-text the list and assembly files will also be streamed
phi-text; color information is added to the list file just to
make it a little flashy, although I have a monochrome monitor so the
colors I picked may be awkward.
One problem exists with streamed formats: in case of an error situation
it is possible to lose important synchronization and so wreck more than
a single character. For this reason streamed phi-text is partially
synchronizing; at the beginning of each line all attribute information
defaults back to a standard default. In this way one never loses more
than a complete line in the presence of simple errors. And even at
that a smart editor could be designed to help one recover from such
simple errors... provided that errors occurred often enough to be worth
the effort.
We have seen that phi-text is composed of 16 banks with 96 characters
per bank, for a total of 1536 characters. The first bank is pure
ASCII, with a few modifications, but what are the other banks?
About half of them are currently unused. Of those some have been
deliberately reserved for application-specific and system-specific
uses by the designer of phi-text. The defined characters can be broken
roughly into the following groups:
1) ASCII characters
2) European extensions (accented characters)
3) greek characters
4) cyrillic characters
5) line drawing characters
6) mathematics characters
7) miscellaneous characters
While it IS possible to extend certain C operators with more compact
character representations in a compiler like this one, use of phi-text
has been limited to allowing greek and cyrillic characters in variable
names, and to allowing things in boxes to be treated as comments. The
primary editor for phi-text has an extension that allows usage of the
arrow keys to draw lines on the screen and this makes beautifying code
a snap.
For more information about phi-text, contact:
Paul McKneely
P.O. BOX 5641
Pasadena, TX, 77508
email: gecko@onramp.net
5.1) Errors
This is a list of possible errors. There are two types of errors... 'Errors'
and 'Warnings'. An 'Error' signifies an event which the compiler cannot
handle, whereas a 'Warning' is a diagnostic which indicates that something
is possibly wrong but the compiler will make assumptions about it.
This list is slightly outdated; it is missing new errors which the
inline assembler can generate.
Each 'Warning' will have a value in parentheses; this value may be used
on the command line to suppress the warning. The value 'all' may be used
to suppress all warnings. Errors may not be suppressed.
Example
cc -w-ieq a.c ; Suppress the 'Possibly incorrect assignment' warning.
cc -w-all a.c ; Suppress ALL warnings
Some of these errors result when the compiler is in C++ mode.
Error: _int keyword not allowed in Pascal declarations
Pascal declarations may not be used as traps or interrupts.
Error: Ambiguity between %s and %s
C++. Compiler cannot choose between two almost equivalent
overloaded definitions.
Warning: ('cln') Argument list too long %s
Argument list for the function call specified is too long. Compiler
ignores the extra args.
Error: Argument list too long in redeclaration of function '%s'
A prototyped function has been redeclared with a different argument list
Error: Argument list too short %s
Too few parameters have been supplied in a function call.
Error: Argument list too short in redeclaration of function '%s'
A prototyped function has been redeclared with a different argument list
Error: Bit field must be signed or unsigned int
ANSI C requires a bit field to be of one of these types.
If extensions are allowed bit fields can be of any integer
type.
Error: Bit field only allowed on scalar types
Bit fields can only be used on integral types.
This error will occur if in non-ansi mode and you use any
non-integral type as the basis for a bit field.
Error: Bit field too big
Bit fields must fit within the processor word size.
Warning: ('pro') Call to function '%s' with no prototype
A function call has been made to a function that has not been
previously declared. Compiler guesses at argument types.
Error: Cannot cast %s
C++. Some casting of classes is not allowed.
Error: Cannot define a pointer or reference to a reference
C++. Reference variables are treated specially in this regard
Error: Cannot initialize '%s'
An error occurred while trying to process a variable initialization
Error: Cannot modify a const val
a CONST value may not be modified
Error: Cannot open file \"%s\" for read access
An include file was not found
Error: Cannot overload 'main'
C++. main() must not be overloaded
Error: Cannot take address of bit field
Pointers to bit fields not allowed
Error: Cannot use bit field as a non-member
Only structure members may have a bit field qualifier.
Warning: ('cno') Code has no effect
This line of code compiled to nothing
Error: Constant value expected
In general initializers must be constant values. Some others
must as well
Error: Constructor/destructor must be untyped
C++ can't type constructors/destructors
Error: Continue not allowed
Not in scope where a continue makes sense
Warning: ('cnv') Conversion may truncate significant digits
An implicit cast may result in loss of significant digits. This
warning is NOT produced for explicit casts.
Error: Could not find a match for '%s'
C++. This function call is not prototyped either directly or
with an overload or defaulted function prototype
Warning: ('dpc') Dangerous pointer cast
If you get this, it will happen when the size of the pointer is
not the same as the size of the (scalar) type you are using
with it.
Error: Declaration expected
Parser got a statement or other value when it was expecting a
declaration.
Error: Declaration not allowed here
Parser found a declaration when it was expecting a statement
Error: Default missing after parameter '%s'
C++... this parameter was assumed to have a default which is missing.
Error: Destructor for class '%s' expected
C++. A destructor was expected.
Error: Duplicate case %d
Two case statements evaluate to the same value
Error: Duplicate label '%s'
The label occurs twice in the same procedure.
Error: Duplicate symbol '%s'
The symbol is being redefined.
Error: Ellipse (...) not allowed in Pascal declarations
Pascal-style declarations may not have variable arguments.
Error: Expected '%c'
The compiler expected a specific character or token.
Error: Expression expected
The compiler was ready to parse an expression but found something
else
Error: File ended with comment in progress
Comments must have an ending point within the same file or
include file.
Error: File name expected in #include directive
#include directive must have a file name
Error: Function declaration not allowed here
A function declaration was attempted in an invalid place, for
example inside a structure or inside another function.
Warning: ('ret') Function should return a value
This warning occurs when a function is not of type 'void'
and you exit without returning a value.
Error: Identifier expected
The parser was expecting a variable/function name.
Error: Illegal call to main() from within program
C++. C++ programs may not call main()
Error: Illegal character '%c'
The parser detected an illegal character sequence.
Error: Illegal pointer
An attempt was made to use a non-pointer in a pointer context
Error: Illegal pure declaration syntax of '%s'
C++. Virtual declaration syntax is wrong
Warning: ('irg') Illegal register var '%s'
the size of the variable was too big for it to fit in a
register
Error: Illegal storage class specifier '%s'
Conflicting or illegal specifier on a declaration.
Error: Illegal storage class specifier on '%s'
Conflicting or illegal specifier on a declaration.
Error: Illegal typedef of '%s'
Attempt to reuse a symbol name as a typedef.
Error: Illegal use of reference operator
Attempt to use '&' in a context where it is not permitted.
Error: Illegal use of void pointer
Cannot take the size of a void pointer.
Error: Inserted '%c'
The parser guessed at a symbol to insert.
Error: Invalid '&' on register var '%s'
Cannot take the address of a register
Error: Invalid floating point
Cannot use floating point in certain types of math functions
(e.g. logic functions)
Error: Invalid preprocessor directive '%s'
Preprocessor directive is unknown
Error: Invalid trap id
CPU-specific. Indicates a cpu operation (int or trap) was called
with an identifier that is too large
Error: '%s' is not a function
Cannot call non-functions.
Error: '%s' is not a label
Cannot jump to non-labels.
Error: Local class functions not supported
C++. Cannot support class definitions as local variables
Error: Local variables may not be used as parameter defaults
C++. Parameter defaults must be in scope prior to calling the function
Warning: ('lli') long long int type not supported, defaulting to long int
long long int type will parse correctly but it is unsupported
Error: Lvalue expected
cannot assign to the address of a variable
Error: Macro substitution error
Macro expansions are limited to 4096 characters
Error: Misplaced else
Unexpected else found in input stream
Error: '%s' must be a predefined class or struct
C++. Cannot work with this structure/class because it has not
been fully defined.
Warning: ('zer') No memory allocated for '%s'
an unsized array has no initializers either.
Warning: ('nsf') Nonexistant static func '%s'
a static function was prototyped but never declared
Warning: ('npo') Nonportable pointer conversion
An implicit pointer conversion may result in code that compiles
incorrectly with other C compilers
Error: Non-scalar array index
Array indexes must be of integral type
Error: Numeric constant is too large
an integer or hex constant was too large for the base type,
or the non-fractional part of a floating-point number could
not fit in a long-integer.
Error: Pointer type expected
A pointer was expected.
Warning: ('ieq') Possible incorrect assignment
the symbol '=' was used at the outer scope in an if statement
expression.
This could be intended, but often is a mistype of the symbol
'==' so the compiler warns you.
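A short sketch of the two cases. When the assignment is intended,
comparing its result explicitly moves the '=' out of the outermost level
of the condition and makes the intent plain to a reader; whether that
also silences 'ieq' in this compiler is an assumption. The names lookup,
fetch and is_answer are hypothetical.

```c
/* Hypothetical table lookup: returns 42 for key 7, otherwise 0. */
static int lookup(int key)
{
    return (key == 7) ? 42 : 0;
}

/* Intended assignment inside the condition, with the result compared
   explicitly. Writing "if (v = lookup(key))" is the form that draws
   the warning. Returns 1 and stores the value if the key was found. */
static int fetch(int key, int *out)
{
    int v;

    if ((v = lookup(key)) != 0) {
        *out = v;
        return 1;
    }
    return 0;
}

/* Intended comparison: '==' rather than '='. */
static int is_answer(int v)
{
    if (v == 42)
        return 1;
    return 0;
}
```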
Warning: ('san') Possible superfluous &
& isn't needed when taking the address of an array. This is a junk
message; ansi C doesn't care either way.
Warning: ('sud') Possible use of '%s' before assignment
A variable has been used but it possibly has not been initialized
with a value
Error: Reference initialization needs lvalue
C++. Reference syntax calls for something whose address can be
taken.
Error: Reference member '%s' in a class with no constructors
C++. The reference variable cannot be initted at class startup because
the constructor is supposed to do it.
Error: Reference variable '%s' must be initialized
C++. Cannot change what a reference variable equates to
at run-time.
Error: Return type is void
Attempt to return a value from a void function
Error: Size is unknown or zero
Attempt to use the size of a variable with a type that has
been forward declared.
Error: Size of '%s' is unknown or zero
Attempt to use the size of a variable with a type that has
been forward declared.
Error: Startup/rundown function '%s' is unknown or not a function
A function named in the '#pragma startup' or '#pragma rundown'
is either not a function or is not defined
Error: String constant too long
A multi-line string is too long.
Warning: ('spc') Suspicious pointer conversion
A pointer operation is being performed on pointers which have
different base types.
Warning: ('fun') Static function '%s' is declared but never used
This static function is just a space waster.
Warning: ('sud') Static variable '%s' is declared but never used
This static variable is just a space waster.
Warning: ('fsu') Structure '%s' is undefined
The compile completed with a structure whose type was never
defined.
Error: Switch argument must be of integral type
Switch arguments must be integers.
Warning: ('tua') Temporary used for parameter %s
C++. A constant was passed in a reference parameter and the compiler
automatically made a variable so the called function would be
happy.
Warning: ('tui') Temporary used to initialize %s
The reference variable is initted with a constant; extra storage
had to be created for it.
Error: Too many initializers
A structure/array has too many initializers
Error: Type expected in sizeof
sizeof argument was not a type or variable
Error: Type mismatch
Generic type mismatch
Error: Type mismatch in arg '%s'
type mismatch for function calls
Error: Type mismatch in redeclaration of '%s'
A variable has been redeclared with a different type from
before.
Error: Type mismatch in return
The value being returned does not match the function type.
Error: Unbalanced preprocessor directives
#if- #endif directives were not balanced.
Warning: Undefined label '%s'
The label should appear somewhere as there is a goto to it.
Error: Undefined symbol '%s'
This is an unknown symbol
Error: Unexpected '%s'
This keyword was unexpected.
Warning: ('urc') Unreachable code
Code stream can never get here.
Warning: ('lun') Unused label '%s'
A label was declared but never used.
Error: User error: %s
#error directive results in this
Warning: ('sas') Variable '%s' is assigned a value which is never used
After assignment to the var, there is no subsequent use
Error: Variable '%s' cannot have a type qualifier
C++. ???
Error: Variable '%s' is not a class instance
C++. A class instance was expected.
Warning: ('sun') Variable '%s' is declared but never used
This variable was declared but nothing ever referenced it. Space
waster.
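To make the usage warnings above concrete, here is a small sketch of
source that would presumably draw several of them. The function names
are made up for illustration, and the mapping from each line to a
warning code is my reading of the descriptions above, not captured
compiler output.

```c
#include <assert.h>

/* Hypothetical helper that nothing calls. */
static int unused_helper(void)      /* 'fun': static function never used */
{
    return 0;
}

/* Hypothetical demo; returns twice its (non-negative) argument. */
int diagnostics_demo(int n)
{
    int never_read;                 /* 'sun': declared but never used */
    int total = 0;

    assert(n >= 0);
    total = n * 2;
    n = 99;                         /* 'sas': assigned value is never used */
    goto done;
    total = -1;                     /* 'urc': control can never get here */
spare:                              /* 'lun': no goto refers to this label */
done:
    return total;
}
```

The code is still legal C and behaves as written; the warnings only
point at the wasted space and dead statements.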
6.1) Stack frames
There are a variety of options for stack frames.
a) Standard C-style stack frames. An index register (EBP or A6) is
used to point at a value between the parameters and the function local
variables; all local variables and function parameters are indexed from
this base register. This is the default.
b) The compiler can free the link register and index all local
variables and function parameters off the stack pointer.
c) (68K only) Parameter lists may be located anywhere in memory;
the parameter list pointer is passed in A0. A0 is then transferred to
A6 and parameters are indexed off A6. Meanwhile local variables are
indexed from the stack pointer.
On the 68K, several codegen options are available. By default the
68K compiler generates PIC code based around a 32K memory model. A
68020 mode is available for specifying extended 68020 features such as
enhanced addressing modes and specialized instructions. Another option
is to generate 68000 code in such a way that a data section greater
than 32K can be used. The final option allows one to disable PIC mode
and generate code that will be placed at absolute addresses in memory.
386 code is fairly straightforward. It is a little more complex than
need be because of the need to use special function registers for some
operations like multiplies and shifts. 386 Code will be a little bulky
because of this need.
68K code is position-independent. All global data is accessed off of
register A5 or A6; function arguments are indexed off of A6 or A7; and
the stack is indexed off A7. String constants are indexed off the PC.
Because of this, the total data size may only be 32K unless either the
/C+2 or the /C+L or the /C+A options are used.
6.2) ASM interface
The assembly language program must not modify any registers except the
scratch registers:
386: EAX, ECX, EDX
Parameters are passed on the stack, with the leftmost parameter at the
lowest address.
In all assembly situations it is convenient to use an index register
to index the parameters. The index register must be loaded with the
address of the first parameter (which will be the stack pointer + 4
if you don't push the index register, or the stack pointer +8 if you
do). Parameters normally take four bytes for the standard data types;
however double and long double types take eight and twelve bytes
respectively. If you pass a structure by value the amount of stack
space used is dependent on the size of the structure.
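The layout rules above can be worked through with a small C sketch
that computes where each parameter sits relative to the stack pointer.
The names arg_kind, arg_size and param_offsets are hypothetical and
are not part of the compiler or its libraries. Structures passed by
value are left out, since their size depends on the structure.

```c
#include <assert.h>
#include <stddef.h>

/* Stack space per argument kind, taken from the text above. */
enum arg_kind { ARG_STANDARD, ARG_DOUBLE, ARG_LONG_DOUBLE };

static size_t arg_size(enum arg_kind kind)
{
    switch (kind) {
    case ARG_DOUBLE:      return 8;
    case ARG_LONG_DOUBLE: return 12;
    default:              return 4;
    }
}

/* Fill offsets[i] with the displacement of parameter i from the stack
   pointer. The leftmost parameter is at the lowest address: stack
   pointer + 4, or stack pointer + 8 if the index register has been
   pushed. */
void param_offsets(const enum arg_kind *kinds, size_t count,
                   int index_reg_pushed, size_t *offsets)
{
    size_t offset = index_reg_pushed ? 8 : 4;
    size_t i;

    assert(kinds != NULL && offsets != NULL);
    for (i = 0; i < count; i++) {
        offsets[i] = offset;
        offset += arg_size(kinds[i]);
    }
}
```

For a routine declared as f(int a, double b, int c) that pushes the
index register, this gives displacements 8, 12 and 20 for a, b and c.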
6.3) Segmentation
The following segments or sections may appear in the output file:
386 name    use
.CODE       Code and string constants
.DATA       Initialized global data
.?DATA      Uninitialized global data
INITDATA    #pragma startup links
EXITDATA    #pragma rundown links
CPPDATA     C++ static initializations
The following switches affect code generation:
/C-b    combine the BSS with the DATA
/C-l    don't put line numbers in ASM file
/C-m    don't mangle with underscores
/C+p    pack variables
/C+r    use reverse order for bit fields.
        Note that this option reverses the allocation order
        but does not reverse the value in the field.
The following #pragma statements affect code generation:
#pragma regopt - enable/disable register allocations
#pragma startup - name a routine to be executed on startup
#pragma rundown - name a routine to be executed on rundown
6.4) Optimizations
The compiler performs the following optimizations:
a) Constant folding.
When common math is done with constants, the compiler will evaluate
the expression and replace it with a constant.
b) Reduction in strength
Multiplies and divides are turned into shifts when appropriate. Mods
are turned into ands when appropriate.
c) Target optimization
When the target for an assignment is known, a temp register will not
be allocated, but the target will be used directly. This keeps us
from generating dead temp registers that will later have to be
optimized out of the icode.
d) Dead code elimination
Delete jumps to jumps, jumps to the next statement, and dead code.
Also delete any temporaries that 2) came up with that are now
unused.
Note that the SETJMP libraries, for example, will NOT save the state of
floating point registers. So there is a switch to disable optimization
into floating point registers in case you need to use setjmp with a
routine that uses floating point. Address and data register
optimizations can also be turned off. See switches.doc.
e) reordering expressions
In some cases the compiler generates better code if expressions are
reordered; for example:
a = a + 10;
can be turned into a += 10 and better code gets generated. Also,
a lot of work has been put into optimizing usage of based/indexed
modes of the processors when it can be done. The present version
will even use index register scaling when possible!
f) base + index addressing modes
This compiler goes to some length to identify when base + index addressing
modes may be used to generate an address.
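Constant folding and reduction in strength can be written out by hand
in C to show what the transformed code computes. This is only a
sketch: the function names are made up, and it uses unsigned operands
and power-of-two constants, where the shift and mask forms are exactly
equivalent. It does not claim to show which cases CC386 itself
recognizes.

```c
#include <assert.h>

/* Constant folding: the whole expression is known at compile time, so
   a folding compiler can emit the single constant 86400. */
unsigned seconds_per_day(void)
{
    return 60u * 60u * 24u;
}

/* Reduction in strength. Each *_reduced function is the cheaper form
   of the function above it. */
unsigned times_eight(unsigned x)          { return x * 8u; }
unsigned times_eight_reduced(unsigned x)  { return x << 3; }

unsigned quarter(unsigned x)              { return x / 4u; }
unsigned quarter_reduced(unsigned x)      { return x >> 2; }

unsigned mod_sixteen(unsigned x)          { return x % 16u; }
unsigned mod_sixteen_reduced(unsigned x)  { return x & 15u; }

/* Abort if any reduced form disagrees with the original for x. */
void check_reductions(unsigned x)
{
    assert(times_eight(x) == times_eight_reduced(x));
    assert(quarter(x) == quarter_reduced(x));
    assert(mod_sixteen(x) == mod_sixteen_reduced(x));
}
```

Calling check_reductions with any unsigned value should pass. Signed
operands need more care, because shifting a negative value right does
not round the same way as division.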
7.1) Description of directory tree
Sources to this compiler are included in separate packages.
Sources should be generic; that is they should work on any architecture
where the byte size is 8 bits. However, I use weird tab settings in my
editor. If you want to comprehend the sources get a beautifier of some
sort and run the sources through it first; or set your editor tab
setting to 2 to see what I see.
The directory structure is:
CLIBS
various sources for run-time library
DOC
documentation
EXAMP
a simple example
(there is a more complex one in clibs\startup\test)
INCLUDE
compiler header files
OBJECT
compiler make/object files
SOURCE
compiler source files
There are two groups of sources:
1) the compiler
in the SOURCE, INCLUDE, OBJECT directories
2) libraries you can use in conjunction
with the compiler to generate programs (target run-time libraries)
in the CLIBS directory
I often use the set of triple directories:
SOURCE
OBJECT
INCLUDE
for a given project so I won't clutter up a single directory with dozens
of files. When this triple comes up, sources are in the SOURCE directory,
headers the sources depend on are in the INCLUDE directory, and you can
expect me to chdir to the OBJECT directory to compile the program...
thus you will find the make file there.
For these triples you thus have to use an include path which consists of
the INCLUDE directory when you compile the source files.
proto.bat generates the file INCLUDE\CC.P, which is a prototype file
I'm using to keep the compiler honest with me. You shouldn't have
to change that unless you make major changes to the sources... but you can
edit CC.P directly and put new prototypes in if you want. I often do.
I only use proto.bat when I'm making major changes to the compiler.
7.2) Porting
This version of the compiler is intended to be portable; one need only
rewrite the back end for the given target. This portability probably
extends only to processors with a 'byte' architecture.
The following symbols have to be defined on the command line:
-DPROGNAME="CC386"     ; Name of the program which will
                         appear in the banner
-DENVNAME="CC386"      ; Name of the environment variable to
                         consult for command line parameters
-DGLBDEFINE="_i386_"   ; Symbol to define in the source; can
                         be used to identify processor-specific needs
-DSOURCEXT=".ASM"      ; Extension to use on the output file
These definitions are imported by CMAIN.C to define the program
environment. I have shown you the definitions used by the 386
compiler; change them as necessary for your target.
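As a rough illustration of how such command-line definitions can be
consumed, the sketch below gives each symbol a fallback value and uses
SOURCEXT to build an output file name. This is not the code in
CMAIN.C; the function output_name and the fallback blocks are
assumptions made for the example.

```c
#include <assert.h>
#include <string.h>

/* Fallbacks so the sketch compiles without the -D switches; a real
   build would supply these on the command line as listed above. */
#ifndef PROGNAME
#define PROGNAME "CC386"
#endif
#ifndef ENVNAME
#define ENVNAME "CC386"
#endif
#ifndef GLBDEFINE
#define GLBDEFINE "_i386_"
#endif
#ifndef SOURCEXT
#define SOURCEXT ".ASM"
#endif

/* Hypothetical helper: copy input to out, replacing its extension (if
   any) with SOURCEXT. A dot that belongs to a directory name is not
   treated as an extension. */
void output_name(const char *input, char *out, size_t outsize)
{
    const char *dot = strrchr(input, '.');
    size_t stem;

    if (dot != NULL && (strchr(dot, '\\') != NULL || strchr(dot, '/') != NULL))
        dot = NULL;
    stem = dot ? (size_t)(dot - input) : strlen(input);

    assert(stem + strlen(SOURCEXT) < outsize);
    memcpy(out, input, stem);
    strcpy(out + stem, SOURCEXT);
}
```

With the 386 definitions, output_name turns hello.c into hello.ASM. A
port to another target would only need a different -DSOURCEXT value.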
The following files comprise the 386 backend. They should be all you
have to change to port the compiler to a new processor. I suggest you
rename them to something else before changing them:
an386.c - Register optimization
reg386.c - Register allocation for expressions
conf386.c - configuration; int sizes and free registers and such
outas386.c - outputs ASM code
gexpr386.c - turn the expression parse trees into code
gstmt386.c - turn the stmt parse trees into code
peep386.c - Peephole analysis for this processor
For more information on porting contact the author of the code.
David Lindauer (gclind01@starbase.spd.louisville.edu)